Cultural Tourism Route Optimization

  • Authored by: Uvini Wijesinghe
  • Duration: 10 Weeks
  • Level: Intermediate
  • Pre-requisite Skills: Python
Creating optimised cultural tourism routes in Melbourne involves integrating data from multiple sources: public memorials, sculptures, artworks, fountains, monuments, and landmarks, along with key transport infrastructure such as City Circle tram stops and Melbourne Visitor Shuttle bus stops. The objective is to use pedestrian movement patterns to design routes that maximise visitor engagement by guiding visitors through high-interest cultural sites while keeping the routes accessible and efficient.

User Story
  • Title: Optimised Cultural Tourism Routes in Melbourne
  • As a: Tourism Planner/City Developer
  • I want to: Integrate data from cultural landmarks, transport infrastructure (City Circle tram stops and Melbourne Visitor Shuttle bus stops), and pedestrian movement patterns to create optimised cultural tourism routes.
  • So that: Visitors can experience a diverse range of cultural sites efficiently, while being guided through high-traffic pedestrian zones and accessible transport hubs to maximise engagement with public artworks, fountains, and monuments.


Acceptance Criteria:

  1. All relevant public memorials, sculptures, artworks, fountains, monuments, and landmarks in Melbourne must be identified, mapped, and included in the dataset.

  2. Data for City Circle tram stops, Melbourne Visitor Shuttle bus stops, and pedestrian pathways must be included to ensure routes are accessible via public transport.

  3. High-footfall areas must be identified through pedestrian counting data to help determine the most popular areas and to adjust routes accordingly to optimise visitor engagement.

  4. Optimised routes should guide visitors through high-interest cultural sites while ensuring accessibility to transport hubs and high pedestrian traffic zones.

  5. Routes should cover the highest number of cultural landmarks while maintaining a smooth, logical flow for visitors.

  6. The system should provide suggestions for areas where new cultural landmarks, public artworks, or monuments could be developed to encourage visitor traffic in underutilised spaces.

  7. The final solution should have a user-friendly interface for tourists, displaying routes, landmarks, and transport stops in a clear and interactive map format.
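
Criterion 2 implies some measure of how close each landmark is to a transport stop. As a minimal sketch, assuming straight-line (great-circle) distance is an acceptable stand-in for walking distance, the haversine formula can pick the nearest stop for a landmark. The helper names `haversine_m` and `nearest_stop` are illustrative and not part of the notebook; the coordinates are copied from sample rows shown later in this notebook.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_stop(lat, lon, stops):
    """Return (stop_name, distance_m) for the stop closest to (lat, lon)."""
    return min(
        ((name, haversine_m(lat, lon, s_lat, s_lon)) for name, s_lat, s_lon in stops),
        key=lambda pair: pair[1],
    )

# Two train stops and one artwork, taken from sample rows in this notebook
stops = [
    ("Flagstaff Station", -37.811880, 144.956043),
    ("Melbourne Central Station", -37.809974, 144.962547),
]
landmark = ("Bird Panels", -37.7953526839703, 144.940687314302)

name, dist = nearest_stop(landmark[1], landmark[2], stops)
print(f"{landmark[0]}: nearest stop is {name}, about {dist:.0f} m away")
```

Walking distance along the street network will generally be longer than this straight-line figure, so it is best read as a lower bound.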

In [3]:
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud

import folium
from folium.plugins import MarkerCluster

🚂 Train Routes

Train Routes¶

This dataset contains information about selected public transport routes in Victoria, Australia, specifically focusing on metropolitan train and tram services. Each record includes key details such as:

  • route_id: A unique identifier for each route.
  • agency_id: The identifier for the transport agency operating the route.
  • route_short_name: A short name used to represent the route (e.g., line name).
  • route_long_name: A longer description of the route, usually indicating its endpoints.
  • route_type: A numerical code representing the type of transport (e.g., 2 for rail services).
  • route_color: The designated colour used to visually represent the route on maps or signage (in hexadecimal format).
  • route_text_color: The colour of text displayed over the route colour for readability.
In [6]:
metro_train_routes = pd.read_csv("Datasets/gtfs/Metro Train/routes.txt", delimiter=",") 

# Extract the part of route_id after 'aus:vic:vic-', dropping any trailing colon
metro_train_routes['train_id'] = metro_train_routes['route_id'].str.extract(r'aus:vic:vic-(.*?):?$', expand=False)

metro_train_routes = metro_train_routes[['train_id', 'route_short_name', 'route_long_name']]

metro_train_routes = metro_train_routes.drop_duplicates()

metro_train_routes.head()
Out[6]:
train_id route_short_name route_long_name
0 02-ALM Alamein Alamein - City
1 02-BEG Belgrave Belgrave - City
2 02-CBE Cranbourne Cranbourne - City
3 02-CCL City Circle NaN
4 02-CGB Craigieburn Craigieburn - City

Train Stops¶

This dataset contains information about train station stops across Melbourne. Each record includes details such as the stop ID, stop name, latitude and longitude coordinates, location type, parent station, and platform code. The dataset allows for precise identification and mapping of station platforms, including major transport hubs like Flagstaff and Melbourne Central. It is useful for transport planning, route optimisation, and improving commuter accessibility across the metropolitan rail network.

All location data is provided in decimal degrees, which supports integration with mapping tools and geographic information systems (GIS).

In [8]:
metro_train_stops = pd.read_csv("Datasets/gtfs/Metro Train/stops.txt", delimiter=",")

metro_train_stops = metro_train_stops[['stop_id', 'stop_name', 'stop_lat','stop_lon']]

metro_train_stops = metro_train_stops.drop_duplicates()

metro_train_stops['stop_id'] = metro_train_stops['stop_id'].astype(str).str.strip()

metro_train_stops.head()
Out[8]:
stop_id stop_name stop_lat stop_lon
0 10117 Jordanville Station -37.873763 145.112473
1 10920 Flagstaff Station -37.811880 144.956043
2 10921 Flagstaff Station -37.811725 144.955968
3 10922 Melbourne Central Station -37.809974 144.962547
4 10923 Melbourne Central Station -37.809865 144.962516

Train Times¶

This dataset provides detailed information on scheduled public transport trips, including stop sequences for specific services. Each row represents a single stop on a trip and includes data such as the trip ID, arrival and departure times, stop ID, stop sequence, pickup and drop-off types, and the distance travelled along the route (in metres).
In [10]:
metro_train_times = pd.read_csv("Datasets/gtfs/Metro Train/stop_times.txt", delimiter=",", dtype={'stop_headsign': str})

metro_train_times['train_id'] = metro_train_times['trip_id'].str.extract(r'(^[^-]+-[^-]+)')

metro_train_times = metro_train_times[['trip_id', 'train_id', 'stop_id', 'stop_sequence']]

metro_train_times = metro_train_times.drop_duplicates()

metro_train_times['stop_id'] = metro_train_times['stop_id'].astype(str).str.strip()

metro_train_times.head()
Out[10]:
trip_id train_id stop_id stop_sequence
0 02-ALM--16-T2-2302 02-ALM 11197 1
1 02-ALM--16-T2-2302 02-ALM 11198 2
2 02-ALM--16-T2-2302 02-ALM 11200 3
3 02-ALM--16-T2-2302 02-ALM 11202 4
4 02-ALM--16-T2-2302 02-ALM 11203 5
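
The train routes and stop times are later joined on `train_id`, so both extractions have to produce the same key. The sketch below replays the two regular expressions from the cells above with Python's `re` module. The `trip_id` is taken from the output shown; the `route_id` string for Alamein is an assumption, modelled on the tram `route_id` format (`aus:vic:vic-03-1:`) that appears later in this notebook.

```python
import re

ROUTE_ID_PATTERN = re.compile(r'aus:vic:vic-(.*?):?$')  # applied to routes.txt route_id
TRIP_ID_PATTERN = re.compile(r'(^[^-]+-[^-]+)')          # applied to stop_times.txt trip_id

def train_id_from_route(route_id):
    """Return the key after 'aus:vic:vic-', or None if the prefix is absent."""
    match = ROUTE_ID_PATTERN.search(route_id)
    return match.group(1) if match else None

def train_id_from_trip(trip_id):
    """Return the first two hyphen-separated parts of a trip_id, or None."""
    match = TRIP_ID_PATTERN.search(trip_id)
    return match.group(1) if match else None

route_key = train_id_from_route("aus:vic:vic-02-ALM:")  # assumed route_id format
trip_key = train_id_from_trip("02-ALM--16-T2-2302")     # trip_id from the output above

print(route_key, trip_key, route_key == trip_key)
```

If the two keys ever differ for a route, the later left merge would leave `route_short_name` empty for that route's stops.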

Trip Ids with Highest Stop Count¶

In [12]:
# Find the highest stop_sequence for each train_id
highest_seq_per_train = metro_train_times.loc[
    metro_train_times.groupby('train_id')['stop_sequence'].idxmax(),
    ['train_id', 'trip_id', 'stop_sequence']
].rename(columns={'stop_sequence': 'max_sequence'})

# Get unique trip_ids
train_unique_trip_ids = highest_seq_per_train['trip_id'].unique()

# Filter metro_train_times for those trip_ids
filtered_metro_train_times = metro_train_times[metro_train_times['trip_id'].isin(train_unique_trip_ids)]
filtered_metro_train_times.head(5)
Out[12]:
trip_id train_id stop_id stop_sequence
2539 02-ALM--16-T5-2801 02-ALM 11213 1
2540 02-ALM--16-T5-2801 02-ALM 22189 2
2541 02-ALM--16-T5-2801 02-ALM 12196 3
2542 02-ALM--16-T5-2801 02-ALM 12198 4
2543 02-ALM--16-T5-2801 02-ALM 12200 5
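
The selection above keeps, for every `train_id`, the single trip that contains the row with the largest `stop_sequence`, which serves as a proxy for the trip with the most stops. A toy example with made-up trip IDs (`A-1`, `A-2` and `B-1` are hypothetical) shows the effect:

```python
import pandas as pd

# Toy stop_times table: route "A" has a 2-stop trip and a 3-stop trip.
toy_times = pd.DataFrame({
    "trip_id":       ["A-1", "A-1", "A-2", "A-2", "A-2", "B-1", "B-1"],
    "train_id":      ["A",   "A",   "A",   "A",   "A",   "B",   "B"],
    "stop_sequence": [1,     2,     1,     2,     3,     1,     2],
})

# Row label of the largest stop_sequence within each train_id
longest_rows = toy_times.groupby("train_id")["stop_sequence"].idxmax()

# Trips that own those rows: one representative trip per train_id
representative_trips = list(toy_times.loc[longest_rows, "trip_id"].unique())

# Keep every stop of the representative trips only
toy_filtered = toy_times[toy_times["trip_id"].isin(representative_trips)]
print(toy_filtered)
```

If two trips on the same route tie for the largest `stop_sequence`, `idxmax` returns the first one it encounters, so the representative trip depends on row order.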

Final Train Stops and Routes Dataset¶

In [14]:
filtered_metro_train_times = filtered_metro_train_times.copy()

# Now it's safe to modify
filtered_metro_train_times['stop_id'] = filtered_metro_train_times['stop_id'].astype(str).str.strip()
metro_train_stops['stop_id'] = metro_train_stops['stop_id'].astype(str).str.strip()

result_train1 = filtered_metro_train_times.merge(metro_train_stops, on='stop_id', how='left')
result_train1.head(2)
Out[14]:
trip_id train_id stop_id stop_sequence stop_name stop_lat stop_lon
0 02-ALM--16-T5-2801 02-ALM 11213 1 Flinders Street Station -37.818307 144.966010
1 02-ALM--16-T5-2801 02-ALM 22189 2 Southern Cross Station -37.818535 144.952144
In [15]:
result_train = result_train1.merge(metro_train_routes, on='train_id', how='left')
result_train = result_train[['trip_id', 'train_id', 'route_short_name', 'stop_id','stop_name', 'stop_lat', 'stop_lon']]
result_train.head(2)
Out[15]:
trip_id train_id route_short_name stop_id stop_name stop_lat stop_lon
0 02-ALM--16-T5-2801 02-ALM Alamein 11213 Flinders Street Station -37.818307 144.966010
1 02-ALM--16-T5-2801 02-ALM Alamein 22189 Southern Cross Station -37.818535 144.952144

🚌 Bus Routes

Bus Routes¶

This dataset outlines information about bus routes operated by a public transport agency. Each record corresponds to a unique route and includes attributes such as the route ID, agency ID, short and long route names, route type, and visual styling properties including route and text colours (in hexadecimal format).

The route_long_name field provides a clear description of the route’s start and end points, aiding in trip planning and wayfinding. The route_type value of 3 indicates these are bus services, and the consistent colour scheme supports visual uniformity in digital and printed transport maps.

In [18]:
metro_bus_routes = pd.read_csv("Datasets/gtfs/Metro Bus/routes.txt", delimiter=",") 
metro_bus_routes = metro_bus_routes[['route_short_name', 'route_long_name']]
metro_bus_routes.head(2)
Out[18]:
route_short_name route_long_name
0 831 Kingsmere Estate - Berwick Station
1 834 Berwick Station

Bus Stops¶

This dataset contains geographic and identifying information for public transport stops for buses across Melbourne. Each entry includes a unique stop ID, the stop name (with nearby street references and suburb), and its geographic coordinates (latitude and longitude in decimal degrees).

The data enables precise mapping of stop locations, supporting route planning, accessibility assessments, and integration with broader transport datasets. It is especially useful for visualising public transport coverage and identifying connectivity within and between suburbs.

Stop names are descriptive and commonly formatted as Street/Street (Suburb), providing clarity for users navigating the transport network.

In [20]:
metro_bus_stops = pd.read_csv("Datasets/gtfs/Metro Bus/stops.txt", delimiter=",")
metro_bus_stops.head(2)
Out[20]:
stop_id stop_name stop_lat stop_lon
0 1000 Dole Ave/Cheddar Rd (Reservoir) -37.700775 145.018951
1 10001 Rex St/Taylors Rd (Kings Park) -37.726975 144.776152

Bus Stop Times¶

This dataset provides detailed stop-level information for scheduled public transport trips in Melbourne. Each row represents a stop within a specific trip, identified by a unique trip_id. It includes the scheduled arrival and departure times, stop ID, the order of the stop in the route (stop_sequence), and the distance travelled along the route (shape_dist_traveled, in metres).

The pickup_type and drop_off_type columns specify how passengers can board or alight at each stop, with values indicating standard pickup and drop-off procedures. This data is vital for constructing accurate transport schedules, simulating travel behaviour, and enhancing the operational efficiency of public transport services.

In [22]:
metro_bus_times = pd.read_csv("Datasets/gtfs/Metro Bus/stop_times.txt", delimiter=",")

metro_bus_times = metro_bus_times[['trip_id', 'stop_id', 'stop_sequence']]

metro_bus_times['bus_number'] = metro_bus_times['trip_id'].str.split('-').str[1]

metro_bus_times.head()
Out[22]:
trip_id stop_id stop_sequence bus_number
0 43-477--1-MF1-1086914 6725 1 477
1 43-477--1-MF1-1086914 6726 2 477
2 43-477--1-MF1-1086914 9095 3 477
3 43-477--1-MF1-1086914 27586 4 477
4 43-477--1-MF1-1086914 27587 5 477

Trip Ids with Highest Stop Count¶

In [24]:
# Find the highest stop_sequence for each bus_number
highest_seq_per_bus = metro_bus_times.loc[
    metro_bus_times.groupby('bus_number')['stop_sequence'].idxmax(),
    ['bus_number', 'trip_id', 'stop_sequence']
].rename(columns={'stop_sequence': 'max_sequence'})

# Get unique trip_ids
bus_unique_trip_ids = highest_seq_per_bus['trip_id'].unique()

# Filter metro_bus_times for those trip_ids
filtered_metro_bus_times = metro_bus_times[metro_bus_times['trip_id'].isin(bus_unique_trip_ids)]
filtered_metro_bus_times.head(5)
Out[24]:
trip_id stop_id stop_sequence bus_number
2617 43-477--1-MF1-1091914 18850 1 477
2618 43-477--1-MF1-1091914 7253 2 477
2619 43-477--1-MF1-1091914 7254 3 477
2620 43-477--1-MF1-1091914 7255 4 477
2621 43-477--1-MF1-1091914 18772 5 477

Final Bus Stops and Routes Dataset¶

In [26]:
filtered_metro_bus_times = filtered_metro_bus_times.copy()

# Trim spaces and convert stop_id to string for consistency
filtered_metro_bus_times['stop_id'] = filtered_metro_bus_times['stop_id'].astype(str).str.strip()
metro_bus_stops['stop_id'] = metro_bus_stops['stop_id'].astype(str).str.strip()

result_bus1 = filtered_metro_bus_times.merge(metro_bus_stops, on='stop_id', how='left')
result_bus1.head(2)
Out[26]:
trip_id stop_id stop_sequence bus_number stop_name stop_lat stop_lon
0 43-477--1-MF1-1091914 18850 1 477 Moonee Ponds Interchange/Mt Alexander Rd (Moon... -37.766260 144.924447
1 43-477--1-MF1-1091914 7253 2 477 Park St/Mt Alexander Rd (Moonee Ponds) -37.761882 144.921515
In [27]:
result_bus = pd.merge(result_bus1, metro_bus_routes, how='left', left_on='bus_number', right_on='route_short_name')
result_bus = result_bus[['trip_id', 'stop_id', 'stop_sequence', 'bus_number', 'route_long_name', 'stop_name', 'stop_lat', 'stop_lon']]
result_bus.head(2)
Out[27]:
trip_id stop_id stop_sequence bus_number route_long_name stop_name stop_lat stop_lon
0 43-477--1-MF1-1091914 18850 1 477 Broadmeadows Station - Moonee Ponds Moonee Ponds Interchange/Mt Alexander Rd (Moon... -37.766260 144.924447
1 43-477--1-MF1-1091914 7253 2 477 Broadmeadows Station - Moonee Ponds Park St/Mt Alexander Rd (Moonee Ponds) -37.761882 144.921515

🚃 Tram Routes

Tram Routes¶

This dataset outlines tram routes operated in Victoria, Australia. Each row corresponds to a unique tram route and includes the following information: route_id (a unique identifier), agency_id (the transport agency operating the service), route_short_name (the tram number), and route_long_name (the full start-to-end route description).

The route_type is indicated as “0”, which represents tram services in accordance with GTFS (General Transit Feed Specification) standards. Additionally, each route is styled with a route_color and route_text_color to support visual clarity in mapping and user interface applications.

In [30]:
metro_tram_routes = pd.read_csv("Datasets/gtfs/Metro Tram/routes.txt", delimiter=",") 
metro_tram_routes = metro_tram_routes[['route_id', 'route_short_name', 'route_long_name']]
metro_tram_routes.head(2)
Out[30]:
route_id route_short_name route_long_name
0 aus:vic:vic-03-1: 1 South Melbourne Beach - East Coburg
1 aus:vic:vic-03-109: 109 Port Melbourne - Box Hill

Tram Stops¶

This dataset lists tram stop locations across Melbourne, Victoria. Each entry includes a unique stop_id, the stop_name (typically indicating the intersecting streets and suburb), and geographic coordinates (stop_lat and stop_lon) for mapping purposes.

The dataset is useful for identifying the exact location of tram stops, and can be integrated with route and trip data for route planning, navigation systems, and urban mobility analysis.

In [32]:
metro_tram_stops = pd.read_csv("Datasets/gtfs/Metro Tram/stops.txt", delimiter=",")
metro_tram_stops.head(2)
Out[32]:
stop_id stop_name stop_lat stop_lon
0 10311 45-Glenferrie Rd/Wattletree Rd (Malvern) -37.862455 145.028508
1 10371 44-Duncraig Ave/Wattletree Rd (Armadale) -37.862069 145.025382

Tram Stop Times¶

This dataset provides tram trip scheduling information for Melbourne's tram routes; the sample rows shown below belong to route 109. Each record corresponds to a specific trip_id and includes details such as arrival_time, departure_time, the associated stop_id, the stop's sequence in the route (stop_sequence), and the cumulative shape_dist_traveled (in metres) from the start of the trip.

In [34]:
metro_tram_times = pd.read_csv("Datasets/gtfs/Metro Tram/stop_times.txt", delimiter=",")

metro_tram_times = metro_tram_times[['trip_id', 'stop_id', 'stop_sequence']]

metro_tram_times['tram_number'] = metro_tram_times['trip_id'].str.split('-').str[1]

metro_tram_times.head()
Out[34]:
trip_id stop_id stop_sequence tram_number
0 03-109--1-T2-129962370 19781 1 109
1 03-109--1-T2-129962370 19782 2 109
2 03-109--1-T2-129962370 19783 3 109
3 03-109--1-T2-129962370 19784 4 109
4 03-109--1-T2-129962370 19785 5 109

Trip Ids with Highest Stop Count¶

In [36]:
# Find the highest stop_sequence for each tram_number
highest_seq_per_tram = metro_tram_times.loc[
    metro_tram_times.groupby('tram_number')['stop_sequence'].idxmax(),
    ['tram_number', 'trip_id', 'stop_sequence']
].rename(columns={'stop_sequence': 'max_sequence'})

# Get unique trip_ids
tram_unique_trip_ids = highest_seq_per_tram['trip_id'].unique()

# Filter metro_tram_times for those trip_ids
filtered_metro_tram_times = metro_tram_times[metro_tram_times['trip_id'].isin(tram_unique_trip_ids)]
filtered_metro_tram_times.head(5)
Out[36]:
trip_id stop_id stop_sequence tram_number
5602 03-109--1-T2-129963278 19725 1 109
5603 03-109--1-T2-129963278 19372 2 109
5604 03-109--1-T2-129963278 19371 3 109
5605 03-109--1-T2-129963278 19370 4 109
5606 03-109--1-T2-129963278 19369 5 109

Final Tram Stops and Routes Dataset¶

In [38]:
filtered_metro_tram_times = filtered_metro_tram_times.copy()

# Trim spaces and convert stop_id to string for consistency
filtered_metro_tram_times['stop_id'] = filtered_metro_tram_times['stop_id'].astype(str).str.strip()
metro_tram_stops['stop_id'] = metro_tram_stops['stop_id'].astype(str).str.strip()

result_tram1 = filtered_metro_tram_times.merge(metro_tram_stops, on='stop_id', how='left')
result_tram1.head(2)
Out[38]:
trip_id stop_id stop_sequence tram_number stop_name stop_lat stop_lon
0 03-109--1-T2-129963278 19725 1 109 129-Beacon Cove/Light Rail (Port Melbourne) -37.840789 144.932813
1 03-109--1-T2-129963278 19372 2 109 128-Graham St/Light Rail (Port Melbourne) -37.837054 144.937190
In [39]:
result_tram1['tram_number'] = result_tram1['tram_number'].astype('int64')
result_tram = pd.merge(result_tram1, metro_tram_routes, how='left', left_on='tram_number', right_on='route_short_name')
result_tram = result_tram[['trip_id', 'stop_id', 'stop_sequence', 'tram_number', 'route_long_name', 'stop_name', 'stop_lat', 'stop_lon']]
result_tram.head(2)
Out[39]:
trip_id stop_id stop_sequence tram_number route_long_name stop_name stop_lat stop_lon
0 03-109--1-T2-129963278 19725 1 109 Port Melbourne - Box Hill 129-Beacon Cove/Light Rail (Port Melbourne) -37.840789 144.932813
1 03-109--1-T2-129963278 19372 2 109 Port Melbourne - Box Hill 128-Graham St/Light Rail (Port Melbourne) -37.837054 144.937190
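
The tram merge casts `tram_number` to `int64` so that it matches `route_short_name`, whereas the bus merge joins the two columns as read. A more defensive variant, sketched here on toy data, casts both keys to strings and uses the `indicator` and `validate` options of `DataFrame.merge` to surface unmatched rows. The route name `3a`, the number `999` and the frame contents are made-up illustrations, not values confirmed from the dataset.

```python
import pandas as pd

toy_stops = pd.DataFrame({
    "trip_id": ["t1", "t2", "t3"],
    "tram_number": ["109", "3a", "999"],   # parsed from trip_id, so always strings
})
toy_routes = pd.DataFrame({
    "route_short_name": [109, "3a"],        # mixed int/str values, for illustration
    "route_long_name": ["Port Melbourne - Box Hill", "Hypothetical route"],
})

# Normalise both join keys to stripped strings before merging
toy_stops["tram_number"] = toy_stops["tram_number"].astype(str).str.strip()
toy_routes["route_short_name"] = toy_routes["route_short_name"].astype(str).str.strip()

merged = toy_stops.merge(
    toy_routes,
    how="left",
    left_on="tram_number",
    right_on="route_short_name",
    validate="many_to_one",   # raises if a route name is duplicated on the right
    indicator=True,           # adds a _merge column: 'both' or 'left_only'
)

unmatched = merged.loc[merged["_merge"] == "left_only", "tram_number"].tolist()
print("Stops without a matching route:", unmatched)
```

String keys avoid a failed integer cast if a route name ever contains a letter, and the `_merge` column makes silent mismatches visible.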

🚶🏻‍♂️Pedestrians


This dataset captures hourly pedestrian counts at various sensor locations across Melbourne. Each record includes a unique ID, the Location_ID of the sensor, the Sensing_Date, and the HourDay representing the hour of observation. Pedestrian flow is divided into Direction_1 and Direction_2, with their sum recorded as Total_of_Directions. Additional fields such as Sensor_Name and Location (latitude, longitude) help identify where the sensor is positioned.

Import and Clean Pedestrians dataset¶

In [42]:
ped_counts = pd.read_csv("Datasets/pedestrian-counting-system-monthly-counts-per-hour.csv")
ped_counts.head(2)
Out[42]:
ID Location_ID Sensing_Date HourDay Direction_1 Direction_2 Total_of_Directions Sensor_Name Location
0 371420221110 37 2022-11-10 14 77 90 167 Lyg260_T -37.80107122, 144.96704554
1 521220230401 52 2023-04-01 12 335 321 656 Eli263_T -37.81252157, 144.9619401
In [43]:
# Split 'Location' into separate Latitude and Longitude
ped_counts[['Latitude', 'Longitude']] = ped_counts['Location'].str.split(',', expand=True)

# Create an explicit copy of the selected columns to avoid SettingWithCopyWarning
date_counts = ped_counts[['Location_ID', 'Sensing_Date', 'Total_of_Directions', 'Sensor_Name', 'Latitude', 'Longitude']].copy()

# Convert Latitude and Longitude to float
date_counts['Latitude'] = date_counts['Latitude'].astype(float)
date_counts['Longitude'] = date_counts['Longitude'].astype(float)

# Display the new DataFrame
date_counts.head(5)
Out[43]:
Location_ID Sensing_Date Total_of_Directions Sensor_Name Latitude Longitude
0 37 2022-11-10 167 Lyg260_T -37.801071 144.967046
1 52 2023-04-01 656 Eli263_T -37.812522 144.961940
2 84 2022-03-30 1611 ElFi_T -37.817980 144.965034
3 54 2023-09-28 332 Swa607_T -37.804024 144.963084
4 61 2022-01-03 284 RMIT14_T -37.807675 144.963091
In [44]:
# Group by sensor (ID, name and coordinates) and sum 'Total_of_Directions'
sensor_count = date_counts.groupby(['Location_ID', 'Sensor_Name', 'Latitude', 'Longitude'], as_index=False)['Total_of_Directions'].sum()

# Display the result
sensor_count.head()
Out[44]:
Location_ID Sensor_Name Latitude Longitude Total_of_Directions
0 1 Bou292_T -37.813494 144.965153 17472047
1 2 Bou283_T -37.813807 144.965167 9253528
2 3 Swa295_T -37.811015 144.964295 21024227
3 4 Swa123_T -37.814880 144.966088 23849378
4 5 PriNW_T -37.818742 144.967877 18475133
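
Summing every hourly record gives a total that depends on how long each sensor has been reporting, so a recently installed sensor can look quieter than it is. As a hedged alternative, the mean count per recorded hour can be computed alongside the total. The sketch below uses plain Python on records in the dataset's `Location` format; the function name `parse_location` is illustrative, and the second record is made up (the first and third come from the sample output above).

```python
from collections import defaultdict

def parse_location(location):
    """Split a 'lat, lon' string into a (float, float) tuple."""
    lat_text, lon_text = location.split(",")
    return float(lat_text), float(lon_text)

# (Sensor_Name, Total_of_Directions, Location)
records = [
    ("Lyg260_T", 167, "-37.80107122, 144.96704554"),
    ("Lyg260_T", 233, "-37.80107122, 144.96704554"),  # made-up second hour
    ("Eli263_T", 656, "-37.81252157, 144.9619401"),
]

totals = defaultdict(int)   # sum of hourly counts per sensor
hours = defaultdict(int)    # number of hourly records per sensor
coords = {}                 # sensor -> (latitude, longitude)
for sensor, count, location in records:
    totals[sensor] += count
    hours[sensor] += 1
    coords[sensor] = parse_location(location)

# Mean pedestrians per recorded hour, comparable across sensors
mean_per_hour = {sensor: totals[sensor] / hours[sensor] for sensor in totals}
print(mean_per_hour)
```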

Exploratory Data Analysis For Pedestrian Data¶


This interactive map visualises pedestrian traffic data across Melbourne using folium and MarkerCluster for efficient rendering. The base map is centred on the average coordinates of all sensor locations. Each sensor is represented by a circular bubble marker, with the size of the bubble scaled based on the total number of pedestrians recorded in both directions (Total_of_Directions).

Larger bubbles indicate higher pedestrian counts, allowing for quick identification of high-footfall areas. The markers include popups that display detailed information, such as the sensor name and total pedestrian count, enhancing usability for exploration and analysis.

In [46]:
# Create a base map centered on the average coordinates (Melbourne)
m = folium.Map(location=[sensor_count['Latitude'].mean(), sensor_count['Longitude'].mean()], zoom_start=15)

# Initialize MarkerCluster for better performance when there are many markers
marker_cluster = MarkerCluster().add_to(m)

# Add bubble markers with size based on the count
for _, row in sensor_count.iterrows():
    # Set the bubble size directly based on the count
    bubble_size = row['Total_of_Directions'] / 500000  # Divide by 500,000 to keep the radii readable
    
    folium.CircleMarker(
        location=[row['Latitude'], row['Longitude']],
        radius=bubble_size,  # Size based directly on count
        color="blue",  # Color can be dynamic based on intensity
        fill=True,
        fill_opacity=0.6,
        fill_color="blue",  # You can adjust this to a gradient for more color intensity
        popup=f"<b>Sensor Name:</b> {row['Sensor_Name']}<br><b>Total of Directions:</b> {row['Total_of_Directions']}"
    ).add_to(marker_cluster)

# Display the map
m
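
With a linear scale, a sensor with twice the count gets twice the radius and therefore four times the circle area, which can overstate differences. One common alternative, offered here as a sketch rather than the notebook's method, is to scale the radius with the square root of the count so that area is proportional to the count. The function names and the `min_radius` and `max_radius` values are illustrative choices.

```python
import math

def linear_radius(count, divisor=500_000):
    """Radius used in the cell above: proportional to the count."""
    return count / divisor

def sqrt_radius(count, max_count, min_radius=4.0, max_radius=30.0):
    """Radius proportional to sqrt(count), so circle area tracks the count."""
    if max_count <= 0:
        return min_radius
    scaled = max_radius * math.sqrt(count / max_count)
    return max(min_radius, scaled)  # keep very small sensors visible

counts = {"Bou283_T": 9_253_528, "Swa123_T": 23_849_378}  # totals from the output above
largest = max(counts.values())

for sensor, count in counts.items():
    print(sensor, round(linear_radius(count), 1), round(sqrt_radius(count, largest), 1))
```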

🧑🏻‍🎨 Public Artworks, Fountains and Monuments


This dataset provides information about public artworks, fountains and monuments located across the City of Melbourne. Each entry includes details such as the artwork's name, artist (where known), year of creation, material or structure type, and specific address or location. Coordinates are provided in both geographic (latitude and longitude) and projected (Easting and Northing) formats, allowing for spatial analysis and visualisation.

Additional context includes alternate names, authorship, and original data sources such as aerial imagery or field surveys.

Import Artworks, Fountains and Monuments dataset¶

In [49]:
places = pd.read_csv("Datasets/public-artworks-fountains-and-monuments.csv")
places.head(2)
Out[49]:
Asset Type Name Xorg Xsource Address Point Artist Alternate Name Art Date Mel way Ref Respective Author Structure Co-ordinates Easting Northing
0 Art Port Phillip Monument City of Melbourne MCC - Ortho Image March 2005 - Final 178 Sims Street, WEST MELBOURNE unknown NaN 1941 2S_K11 City Of Melbourne Basalt monument -37.8056957854241, 144.907291041632 315771.745 5813680.208
1 Art Bird Panels City of Melbourne MCC - Ortho Image March 2005 - Final 76 Canning Street Di Christensen and Bernice McPherson NaN 1995 2A_E5 City Of Melbourne Stainless-steel panels -37.7953526839703, 144.940687314302 318686.757 5814893.278

Exploratory Data Analysis For Artworks, Fountains and Monuments¶

Pie Chart: Different Types of Assets¶

The dataset provides the frequency count of each asset type, which is visualised using a pie chart. The chart represents the distribution of various asset types (such as public artworks, monuments, and fountains) in the City of Melbourne. Each asset type's proportion is displayed in percentages, making it easy to assess the relative abundance of each type.

Distribution of Asset Types:

  • Artworks: 74.3%
  • Monuments: 18%
  • Fountains: 7.7%
In [52]:
# Count frequency of each Asset Type
asset_counts = places['Asset Type'].value_counts()

# Plot pie chart
plt.figure(figsize=(8, 8))
plt.pie(asset_counts, labels=asset_counts.index, autopct='%1.1f%%', startangle=140, colors=plt.cm.Paired.colors)
plt.title('Distribution of Asset Types')
plt.axis('equal')  # Equal aspect ratio ensures the pie chart is circular.
plt.show()
Bar Chart: Number of Artworks Managed by Different Organisations¶

This bar chart shows how many entries in the dataset are attributed to each organisation (the 'Xorg' column) in the City of Melbourne. Each bar is annotated with its count, making it easy to see which organisations are responsible for the most public art installations.

The x-axis represents the organisations, while the y-axis indicates the number of artworks contributed by each.

Number of Artworks by Organisation (Xorg):

  • City of Melbourne: 199 artworks
  • VicUrban: 38 artworks
  • Beveridge Williams Surveyors: 15 artworks
In [54]:
# Count frequency of each Xorg
xorg_counts = places['Xorg'].value_counts()

# Plot bar chart
plt.figure(figsize=(10, 6))
ax = xorg_counts.plot(kind='bar', color='skyblue', edgecolor='black')

# Adding text on top of each bar to show the count
for i in ax.patches:
    ax.annotate(f'{i.get_height()}', 
                (i.get_x() + i.get_width() / 2, i.get_height()), 
                xytext=(0, 5), 
                textcoords='offset points', 
                ha='center', 
                va='bottom', 
                fontsize=10, 
                color='black')

plt.title('Number of Artworks by Xorg')
plt.xlabel('Xorg')
plt.ylabel('Number of Artworks')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
Word Cloud: Most Common Artists¶
This visualisation presents a word cloud built from the Artist column of the public artworks dataset. All artist names, excluding missing values (NaN), are joined into a single string before the cloud is generated, so the cloud is weighted by individual words: larger words are name parts (first names or surnames) that occur more often in the dataset, and placeholder values such as "unknown" are counted as well.
In [56]:
# Drop NaN values from 'Artist' column
artists = places['Artist'].dropna()

# Combine all artist names into a single string
artist_text = " ".join(artists)

# Generate the word cloud
wordcloud = WordCloud(width=800, height=400, background_color='white', colormap='viridis').generate(artist_text)

# Plot the word cloud
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Word Cloud of Artists')
plt.tight_layout()
plt.show()
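
Because `WordCloud.generate` splits the joined string into words, the cloud weights name fragments rather than whole artist names. If whole names are wanted, one option (an assumption about intent, not something the notebook does) is to count names first and pass the counts to `WordCloud.generate_from_frequencies`. The counting step can be sketched with the standard library; the artist list here is made up apart from the `unknown` placeholder seen in the data.

```python
from collections import Counter

artists = ["Jane Doe", "John Doe", "Jane Doe", "unknown", "Alex Smith"]  # made-up sample

# Word-level counts: roughly what a cloud built from " ".join(artists) reflects
word_counts = Counter(" ".join(artists).split())

# Name-level counts: one entry per full artist name, skipping the placeholder
name_counts = Counter(
    name.strip() for name in artists if name.strip().lower() != "unknown"
)

print(word_counts.most_common(2))
print(name_counts.most_common(2))
```

`name_counts` could then be passed to `WordCloud(...).generate_from_frequencies(name_counts)` so that each full name is sized by the number of works attributed to it.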
Map: Locations of Artworks, Fountains and Monuments¶

This interactive map displays the locations of various public artworks, monuments, sculptures, and panels throughout Melbourne. Each marker on the map represents an asset and is colour-coded based on its type:

  • Blue markers represent artworks.
  • Green markers represent monuments.
  • Purple markers represent sculptures.
  • Orange markers represent panels.

Hovering over a marker shows a detailed tooltip with the asset's type, name, associated organisation, artist, and year of creation; clicking the marker opens a popup with the asset's name. Asset types that are not listed in the colour mapping (fountains, for example) are drawn with a gray marker. This map serves as an informative guide to the diverse range of public art in Melbourne, offering users the opportunity to explore and learn more about these significant cultural landmarks.
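
Since any `Asset Type` missing from the colour mapping silently falls back to gray, it is worth checking which types that affects. A small sketch with the standard library, using a made-up list of asset types (the exact labels in the dataset's `Asset Type` column, other than `Art`, are an assumption):

```python
from collections import Counter

asset_colors = {"Art": "blue", "Monument": "green", "Sculpture": "purple", "Panel": "orange"}

# Made-up sample; in the notebook this would be places['Asset Type']
asset_types = ["Art", "Art", "Monument", "Fountain", "Art", "Fountain"]

# Same lookup as the map cell: unknown types get the default colour
marker_colors = [asset_colors.get(asset_type, "gray") for asset_type in asset_types]

# Types that are not in the mapping and are therefore drawn in gray
unmapped = Counter(t for t in asset_types if t not in asset_colors)
print(marker_colors)
print(dict(unmapped))
```

Any type reported in `unmapped` can then be given its own entry in `asset_colors`.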

In [58]:
# Convert 'Co-ordinates' column to separate latitude and longitude
places[['Latitude', 'Longitude']] = places['Co-ordinates'].str.split(',', expand=True)
places['Latitude'] = places['Latitude'].astype(float)
places['Longitude'] = places['Longitude'].astype(float)

# Define color mapping for different Asset Types
asset_colors = {
    "Art": "blue",
    "Monument": "green",
    "Sculpture": "purple",
    "Panel": "orange"
}

# Create a base map centered on Melbourne
m = folium.Map(location=[-37.81, 144.96], zoom_start=13)

# Add markers with detailed tooltip
for _, row in places.iterrows():
    asset_type = row['Asset Type']
    color = asset_colors.get(asset_type, "gray")  # Default color if type is missing

    # Construct the tooltip with bold labels
    tooltip = f"""
    <b>Asset Type:</b> {row['Asset Type']}<br>
    <b>Name:</b> {row['Name']}<br>
    <b>Organization:</b> {row['Xorg']}<br>
    <b>Artist:</b> {row['Artist']}<br>
    <b>Year:</b> {row['Art Date']}
    """

    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        popup=row['Name'],
        tooltip=tooltip,
        icon=folium.Icon(color=color)
    ).add_to(m)

# Display the map
m

Mapping the Train Routes


The cell below visualises Melbourne's train routes on an interactive map. It reads the result_train DataFrame and draws each train route (train_id) in its own colour, taken from a palette of visually distinct colours.

In [60]:
import folium
import pandas as pd
from branca.element import Template, MacroElement
import matplotlib.colors as mcolors
import random

# Create map centered on average coordinates
map_center = [result_train['stop_lat'].mean(), result_train['stop_lon'].mean()]
m = folium.Map(location=map_center, zoom_start=12, tiles='CartoDB positron')

# Generate distinct colors for each train_id
def generate_distinct_colors(n):
    """Generate n visually distinct colors"""
    colors = []
    # Start with a set of good distinct colors
    base_colors = list(mcolors.TABLEAU_COLORS.values())  # Tableau palette
    base_colors.extend(['#FF00FF', '#00FFFF', '#FFA500', '#800080', '#008080'])
    
    if n <= len(base_colors):
        return base_colors[:n]
    
    # If we need more colors than we have base colors, generate random but distinct ones
    for _ in range(n - len(base_colors)):
        # Generate random but reasonably distinct colors
        h = random.random()
        s = 0.7 + random.random() * 0.3
        v = 0.6 + random.random() * 0.3
        rgb = mcolors.hsv_to_rgb([h, s, v])
        colors.append(mcolors.to_hex(rgb))
    
    return base_colors + colors

# Get unique train_ids and assign colors
unique_train_ids = result_train['train_id'].unique()
colors = generate_distinct_colors(len(unique_train_ids))
color_dict = dict(zip(unique_train_ids, colors))

# Add each train route separately
for train_id, group in result_train.groupby('train_id'):
    color = color_dict[train_id]
    
    # Create feature group for this route
    route_name = group['route_short_name'].iloc[0] if 'route_short_name' in group.columns else train_id
    fg = folium.FeatureGroup(name=f"{route_name} (Train ID: {train_id})")
    
    # Add line for the route
    line = folium.PolyLine(
        locations=group[['stop_lat', 'stop_lon']].values,
        color=color,
        weight=5,
        opacity=0.7,
        tooltip=f"<b>{route_name}</b><br>Train ID: {train_id}"
    )
    fg.add_child(line)
    
    # Add markers for each stop
    for idx, row in group.iterrows():
        stop_name = row['stop_name'] if 'stop_name' in row else f"Stop {row['stop_id']}"
        stop_id = row['stop_id'] if 'stop_id' in row else "N/A"
        
        marker = folium.CircleMarker(
            location=[row['stop_lat'], row['stop_lon']],
            radius=6,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=1,
            tooltip=f"""
            <div style='width: 200px'>
                <b>Route:</b> {route_name}<br>
                <b>Train ID:</b> {train_id}<br>
                <b>Stop:</b> {stop_name}<br>
                <b>Stop ID:</b> {stop_id}
            </div>
            """
        )
        fg.add_child(marker)
    
    m.add_child(fg)

# Add layer control to toggle routes
folium.LayerControl().add_to(m)

# Custom CSS template for better tooltips
template = """
{% macro html(this, kwargs) %}
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Train Routes</title>
  <style>
    .folium-tooltip {
        font-size: 14px;
        line-height: 1.4;
        max-width: 250px;
    }
    .folium-tooltip table {
        border-collapse: collapse;
    }
    .folium-tooltip th, .folium-tooltip td {
        padding: 2px 5px;
        text-align: left;
    }
    .folium-tooltip hr {
        margin: 5px 0;
        border: 0;
        border-top: 1px solid #eee;
    }
  </style>
</head>
</html>
{% endmacro %}
"""

macro = MacroElement()
macro._template = Template(template)
m.get_root().add_child(macro)

# Display the map
m
Out[60]:
[Interactive Folium map of train routes; not rendered in this export]
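
One caveat about the coloring step: the random fallback in generate_distinct_colors does not guarantee that two routes never receive near-identical hues, and the colors change on every run. A minimal alternative sketch, assuming a deterministic palette is acceptable, steps the hue by the golden ratio conjugate so that consecutive colors stay well separated. The function name spaced_hex_colors and its default saturation and value are illustrative assumptions, not part of the original notebook.

```python
import colorsys

# Stepping the hue by this constant spreads successive hues around the color wheel
GOLDEN_RATIO_CONJUGATE = 0.618033988749895

def spaced_hex_colors(n, saturation=0.8, value=0.8):
    """Return n hex colors whose hues are stepped by the golden ratio conjugate."""
    colors = []
    hue = 0.0
    for _ in range(n):
        r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
        hue = (hue + GOLDEN_RATIO_CONJUGATE) % 1.0
    return colors

# Same call shape as generate_distinct_colors(len(unique_train_ids))
palette = spaced_hex_colors(30)
```

Because the output depends only on n, the same route keeps the same color between runs as long as the order of the unique IDs does not change.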

Mapping the Bus Routes


This map visualizes bus routes in Melbourne using Folium and Pandas. It takes data from the result_bus DataFrame, which contains details like bus numbers, stop names, and geographic coordinates. The map is centered based on the average stop locations.
In [62]:
import folium
import pandas as pd
from branca.element import Template, MacroElement
import matplotlib.colors as mcolors
import random

# Create map centered on average coordinates
map_center = [result_bus['stop_lat'].mean(), result_bus['stop_lon'].mean()]
m = folium.Map(location=map_center, zoom_start=12, tiles='CartoDB positron')

# Generate distinct colors for each bus_number
def generate_distinct_colors(n):
    """Generate n visually distinct colors"""
    colors = []
    # Start with a set of good distinct colors
    base_colors = list(mcolors.TABLEAU_COLORS.values())  # Tableau palette
    base_colors.extend(['#FF00FF', '#00FFFF', '#FFA500', '#800080', '#008080'])
    
    if n <= len(base_colors):
        return base_colors[:n]
    
    # If we need more colors than we have base colors, generate random but distinct ones
    for _ in range(n - len(base_colors)):
        # Generate random but reasonably distinct colors
        h = random.random()
        s = 0.7 + random.random() * 0.3
        v = 0.6 + random.random() * 0.3
        rgb = mcolors.hsv_to_rgb([h, s, v])
        colors.append(mcolors.to_hex(rgb))
    
    return base_colors + colors

# Get unique bus numbers and assign colors
unique_bus_numbers = result_bus['bus_number'].unique()
colors = generate_distinct_colors(len(unique_bus_numbers))
color_dict = dict(zip(unique_bus_numbers, colors))

# Add each bus route separately
for bus_number, group in result_bus.groupby('bus_number'):
    color = color_dict[bus_number]
    
    # Create feature group for this route
    route_name = group['route_long_name'].iloc[0] if 'route_long_name' in group.columns else bus_number
    fg = folium.FeatureGroup(name=f"{route_name} (Bus Number: {bus_number})")
    
    # Add line for the route
    line = folium.PolyLine(
        locations=group[['stop_lat', 'stop_lon']].values,
        color=color,
        weight=5,
        opacity=0.7,
        tooltip=f"<b>{route_name}</b><br>Bus Number: {bus_number}"
    )
    fg.add_child(line)
    
    # Add markers for each stop
    for idx, row in group.iterrows():
        stop_name = row['stop_name'] if 'stop_name' in row else f"Stop {row['stop_id']}"
        stop_id = row['stop_id'] if 'stop_id' in row else "N/A"
        
        marker = folium.CircleMarker(
            location=[row['stop_lat'], row['stop_lon']],
            radius=6,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=1,
            tooltip=f"""
            <div style='width: 200px'>
                <b>Route:</b> {route_name}<br>
                <b>Bus Number:</b> {bus_number}<br>
                <b>Stop:</b> {stop_name}<br>
                <b>Stop ID:</b> {stop_id}
            </div>
            """
        )
        fg.add_child(marker)
    
    m.add_child(fg)

# Add layer control to toggle routes
folium.LayerControl().add_to(m)

# Custom CSS template for better tooltips
template = """
{% macro html(this, kwargs) %}
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Bus Routes</title>
  <style>
    .folium-tooltip {
        font-size: 14px;
        line-height: 1.4;
        max-width: 250px;
    }
    .folium-tooltip table {
        border-collapse: collapse;
    }
    .folium-tooltip th, .folium-tooltip td {
        padding: 2px 5px;
        text-align: left;
    }
    .folium-tooltip hr {
        margin: 5px 0;
        border: 0;
        border-top: 1px solid #eee;
    }
  </style>
</head>
</html>
{% endmacro %}
"""

macro = MacroElement()
macro._template = Template(template)
m.get_root().add_child(macro)

# Display the map
m
Out[62]:
[Interactive Folium map of bus routes; not rendered in this export]

Mapping the Tram Routes


The cell below builds an interactive map of tram routes across Melbourne using the Folium library. It reads tram stop and route data from the result_tram DataFrame, which includes details such as tram numbers, stop names, and geographic coordinates.
In [64]:
import folium
import pandas as pd
from branca.element import Template, MacroElement
import matplotlib.colors as mcolors
import random

# Create map centered on average coordinates
map_center = [result_tram['stop_lat'].mean(), result_tram['stop_lon'].mean()]
m = folium.Map(location=map_center, zoom_start=12, tiles='CartoDB positron')

# Generate distinct colors for each tram_number
def generate_distinct_colors(n):
    """Generate n visually distinct colors"""
    colors = []
    # Start with a set of good distinct colors
    base_colors = list(mcolors.TABLEAU_COLORS.values())  # Tableau palette
    base_colors.extend(['#FF00FF', '#00FFFF', '#FFA500', '#800080', '#008080'])
    
    if n <= len(base_colors):
        return base_colors[:n]
    
    # If we need more colors than we have base colors, generate random but distinct ones
    for _ in range(n - len(base_colors)):
        # Generate random but reasonably distinct colors
        h = random.random()
        s = 0.7 + random.random() * 0.3
        v = 0.6 + random.random() * 0.3
        rgb = mcolors.hsv_to_rgb([h, s, v])
        colors.append(mcolors.to_hex(rgb))
    
    return base_colors + colors

# Get unique tram numbers and assign colors
unique_tram_numbers = result_tram['tram_number'].unique()
colors = generate_distinct_colors(len(unique_tram_numbers))
color_dict = dict(zip(unique_tram_numbers, colors))

# Add each tram route separately
for tram_number, group in result_tram.groupby('tram_number'):
    color = color_dict[tram_number]
    
    # Create feature group for this route
    route_name = group['route_long_name'].iloc[0] if 'route_long_name' in group.columns else tram_number
    fg = folium.FeatureGroup(name=f"{route_name} (Tram Number: {tram_number})")
    
    # Add line for the route
    line = folium.PolyLine(
        locations=group[['stop_lat', 'stop_lon']].values,
        color=color,
        weight=5,
        opacity=0.7,
        tooltip=f"<b>{route_name}</b><br>Tram Number: {tram_number}"
    )
    fg.add_child(line)
    
    # Add markers for each stop
    for idx, row in group.iterrows():
        stop_name = row['stop_name'] if 'stop_name' in row else f"Stop {row['stop_id']}"
        stop_id = row['stop_id'] if 'stop_id' in row else "N/A"
        
        marker = folium.CircleMarker(
            location=[row['stop_lat'], row['stop_lon']],
            radius=6,
            color=color,
            fill=True,
            fill_color=color,
            fill_opacity=1,
            tooltip=f"""
            <div style='width: 200px'>
                <b>Route:</b> {route_name}<br>
                <b>Tram Number:</b> {tram_number}<br>
                <b>Stop:</b> {stop_name}<br>
                <b>Stop ID:</b> {stop_id}
            </div>
            """
        )
        fg.add_child(marker)
    
    m.add_child(fg)

# Add layer control to toggle routes
folium.LayerControl().add_to(m)

# Custom CSS template for better tooltips
template = """
{% macro html(this, kwargs) %}
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Tram Routes</title>
  <style>
    .folium-tooltip {
        font-size: 14px;
        line-height: 1.4;
        max-width: 250px;
    }
    .folium-tooltip table {
        border-collapse: collapse;
    }
    .folium-tooltip th, .folium-tooltip td {
        padding: 2px 5px;
        text-align: left;
    }
    .folium-tooltip hr {
        margin: 5px 0;
        border: 0;
        border-top: 1px solid #eee;
    }
  </style>
</head>
</html>
{% endmacro %}
"""

macro = MacroElement()
macro._template = Template(template)
m.get_root().add_child(macro)

# Display the map
m
Out[64]:
[Interactive Folium map of tram routes; not rendered in this export]
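
All three maps draw each route's PolyLine in DataFrame row order, so a route only looks right if its stops are already stored in travel order. The sketch below shows one way to enforce that order before plotting. It assumes a stop_sequence column is available (GTFS-style feeds carry one in stop_times); that column and the helper name ordered_route_coords are assumptions, since the notebook does not show whether result_train, result_bus, or result_tram include it.

```python
import pandas as pd

def ordered_route_coords(stops, route_col, order_col="stop_sequence"):
    """Map each route id to its [lat, lon] pairs sorted by order_col."""
    ordered = stops.sort_values([route_col, order_col])
    return {
        route_id: group[["stop_lat", "stop_lon"]].values.tolist()
        for route_id, group in ordered.groupby(route_col)
    }

# Toy rows standing in for result_tram; stop_sequence is an assumed column
demo = pd.DataFrame({
    "tram_number": ["35", "35", "35"],
    "stop_sequence": [3, 1, 2],
    "stop_lat": [-37.8140, -37.8183, -37.8160],
    "stop_lon": [144.9460, 144.9671, 144.9550],
})
coords = ordered_route_coords(demo, "tram_number")
```

Each list in coords can be passed directly as the locations argument of folium.PolyLine in place of group[['stop_lat', 'stop_lon']].values.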

Nearest Stops to the Tourist Attraction Areas


This step identifies the nearest public transport stop of each mode (train, tram, and bus) for every tourist attraction in the places DataFrame (copied here as artworks_df). Each attraction has geographic coordinates, and the script uses the geopy library to compute the geodesic distance from the attraction to every stop of a given mode, keeping the closest one.
In [66]:
import pandas as pd
from geopy.distance import geodesic

artworks_df = places.copy()

# Extract lat/lon from 'Co-ordinates'
artworks_df[['lat', 'lon']] = artworks_df['Co-ordinates'].str.split(',', expand=True).astype(float)

train_df = result_train.copy()

bus_df = result_bus

tram_df = result_tram

# Column that holds the route label for each transport mode
ROUTE_COLUMNS = {'train': 'route_short_name', 'bus': 'bus_number', 'tram': 'tram_number'}

# Find the stop in stops_df closest to one artwork (geodesic distance in meters)
def find_nearest_stop(artwork_row, stops_df, mode):
    route_col = ROUTE_COLUMNS[mode]  # raises KeyError for an unknown mode
    min_distance = float('inf')
    nearest_info = {}
    artwork_point = (artwork_row['lat'], artwork_row['lon'])

    for _, stop in stops_df.iterrows():
        stop_point = (stop['stop_lat'], stop['stop_lon'])
        distance = geodesic(artwork_point, stop_point).meters
        if distance < min_distance:
            # Closer stop found: remember its route, name and distance
            min_distance = distance
            nearest_info = {
                f'{mode}_route': stop[route_col],
                f'{mode}_stop_name': stop['stop_name'],
                f'{mode}_distance_m': round(distance, 2)
            }
    return pd.Series(nearest_info)
In [67]:
artworks_df = artworks_df.join(
    artworks_df.apply(lambda row: find_nearest_stop(row, train_df, 'train'), axis=1)
)
In [68]:
artworks_df = artworks_df.join(
    artworks_df.apply(lambda row: find_nearest_stop(row, tram_df, 'tram'), axis=1)
)
In [69]:
artworks_df = artworks_df.join(
    artworks_df.apply(lambda row: find_nearest_stop(row, bus_df, 'bus'), axis=1)
)
In [70]:
artworks_df.head(10)
Out[70]:
| | Asset Type | Name | Xorg | Xsource | Address Point | Artist | Alternate Name | Art Date | Mel way Ref | Respective Author | ... | lon | train_route | train_stop_name | train_distance_m | tram_route | tram_stop_name | tram_distance_m | bus_route | bus_stop_name | bus_distance_m |
| 0 | Art | Port Phillip Monument | City of Melbourne | MCC - Ortho Image March 2005 - Final | 178 Sims Street, WEST MELBOURNE | unknown | NaN | 1941 | 2S_K11 | City Of Melbourne | ... | 144.907291 | Werribee | Footscray Station | 613.56 | 82 | 64-Footscray Station/Leeds St (Footscray) | 738.72 | 220 | Whitehall St/Napier St (Footscray) | 233.27 |
| 1 | Art | Bird Panels | City of Melbourne | MCC - Ortho Image March 2005 - Final | 76 Canning Street | Di Christensen and Bernice McPherson | NaN | 1995 | 2A_E5 | City Of Melbourne | ... | 144.940687 | Upfield | Macaulay Station | 409.63 | 57 | 19-Abbotsford St Interchange/Abbotsford St (No... | 581.13 | 402 | Melrose St/Canning St (North Melbourne) | 74.70 |
| 2 | Art | Blowhole | Beveridge Williams Surveyors | Field Survey | 15 Harbour Esplanade, DOCKLANDS | Duncan Stemler | NaN | 2005 | 2E_G8 | City of Melbourne | ... | 144.946765 | Flemington Racecourse | Southern Cross Station | 563.26 | 35 | D4-Docklands Park/Harbour Esp (Docklands) | 67.95 | 237 | Collins Square/Batmans Hill Dr (Docklands) | 223.64 |
| 3 | Art | Threaded Field | VicUrban (Docklands) | AAM Hatch Photogrammetry - March 2005 | 687 La Trobe Street, DOCKLANDS | Simon Perry | NaN | 1999 | 2E_G4 | VicUrban | ... | 144.946829 | City Circle | Southern Cross Station | 547.21 | 86 | D1-Docklands Stadium/La Trobe St (Docklands) | 106.70 | 220 | Festival Hall/Dudley St (West Melbourne) | 500.40 |
| 4 | Art | Threaded Field | VicUrban (Docklands) | MCC - Ortho Image March 2005 - Final | 157 Wurundjeri Way, DOCKLANDS | Simon Perry | NaN | 1999 | 2E_G4 | VicUrban | ... | 144.947772 | City Circle | Southern Cross Station | 496.48 | 86 | D1-Docklands Stadium/La Trobe St (Docklands) | 122.34 | 216 | Jeffcott St/Spencer St (West Melbourne) | 472.98 |
| 5 | Art | Blowhole | Beveridge Williams Surveyors | Field Survey | 15 Harbour Esplanade, DOCKLANDS | Duncan Stemler | NaN | 2005 | 2E_G8 | City of Melbourne | ... | 144.946799 | Flemington Racecourse | Southern Cross Station | 562.41 | 35 | D4-Docklands Park/Harbour Esp (Docklands) | 65.89 | 237 | Collins Square/Batmans Hill Dr (Docklands) | 221.81 |
| 6 | Art | Cow Up A Tree | AAMHatch | AAM Hatch Photogrammetry - March 2005 | 135 Harbour Esplanade, DOCKLANDS | John Kelly | NaN | 2000 | 2E,_G5 | City Of Melbourne | ... | 144.945280 | Flemington Racecourse | Southern Cross Station | 592.17 | 35 | D3-Stadium Precinct - Bourke St/Harbour Esp (D... | 70.84 | 232 | Collins Square/Collins St (Docklands) | 499.48 |
| 7 | Art | Coat of Arms | City of Melbourne | MCC - Ortho Image March 2005 - Final | 100 Swanston Street, MELBOURNE | City of Melbourne | NaN | 1992 | 2F_F4 | City Of Melbourne | ... | 144.966479 | Lilydale | Flinders Street Station | 344.29 | 109 | 6-Melbourne Town Hall/Collins St (Melbourne City) | 47.25 | 251 | Melbourne Central/Lonsdale St (Melbourne City) | 414.91 |
| 8 | Art | Federation Bells | City of Melbourne | MCC - Ground Ortho Image March 2008 | Birrarung Marr, MELBOURNE | Designers Neil McLachlan and Anton Hasell | NaN | 2002 | 2F_ K6 | City of Melbourne | ... | 144.974111 | Cranbourne | Flinders Street Station | 633.55 | 70 | 7A-William Barak Bridge/Melbourne Park (Melbou... | 194.29 | 605 | Southbank Theatre/Southbank Bvd (Southbank) | 726.99 |
| 9 | Monument | Thomas Ferguson Memorial Drinking Fountain | City of Melbourne | MCC - Ortho Image March 2005 - Final | 179 Leicester Street, CARLTON | unknown | NaN | 1912 | 2B_C9 | City Of Melbourne | ... | 144.960227 | Cranbourne | Melbourne Central Station | 945.84 | 19 | 9-Pelham St/Elizabeth St (Melbourne City) | 235.75 | 402 | Melbourne University/Grattan St (Carlton) | 164.13 |

10 rows × 27 columns

In [71]:
artworks_df.to_csv('transport_distance.csv', index=False)
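
The nearest-stop search above calls geodesic once per artwork and stop pair inside iterrows, which can become slow as the stop tables grow. The sketch below is a possible faster variant, not the notebook's method: it swaps geopy's ellipsoidal geodesic for a vectorized haversine (spherical) distance in NumPy, so results may differ slightly from the geodesic values. The function name nearest_stop_haversine and the demo coordinates are illustrative assumptions.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def nearest_stop_haversine(art_lat, art_lon, stops):
    """Return (row label, distance in meters) of the stop closest to one point."""
    lat1, lon1 = np.radians(art_lat), np.radians(art_lon)
    lat2 = np.radians(stops["stop_lat"].to_numpy())
    lon2 = np.radians(stops["stop_lon"].to_numpy())
    # Haversine formula evaluated for all stops at once
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    dist_m = 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))
    pos = int(np.argmin(dist_m))
    return stops.index[pos], float(dist_m[pos])

# Toy stops with made-up coordinates, standing in for train_df / tram_df / bus_df
demo_stops = pd.DataFrame({
    "stop_name": ["Stop A", "Stop B"],
    "stop_lat": [-37.8183, -37.8184],
    "stop_lon": [144.9671, 144.9525],
})
label, dist = nearest_stop_haversine(-37.8150, 144.9660, demo_stops)
nearest_name = demo_stops.loc[label, "stop_name"]
```

The returned row label can be used with .loc to pull the route and stop name, mirroring the fields that find_nearest_stop collects.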

Closest Sensor With Highest Pedestrian Count For Each Artwork


This step finds the pedestrian sensor closest to each artwork (by haversine distance) and then ranks the artworks by that sensor's total pedestrian count. The result shows which public art locations sit near high-footfall areas, which helps assess the visibility and accessibility of these attractions.
In [73]:
import pandas as pd
import numpy as np

# Define the sensor DataFrame
sensor_df = sensor_count.copy()

# Define the artworks DataFrame
artworks_df = places.copy()

# Haversine function to calculate distance between two lat/lon points
def haversine(lat1, lon1, lat2, lon2):
    R = 6371  # Earth radius in km
    lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
    return 2 * R * np.arcsin(np.sqrt(a))

# Find the closest sensor for each artwork
closest_sensors = []
for _, art in artworks_df.iterrows():
    distances = haversine(art['Latitude'], art['Longitude'], sensor_df['Latitude'], sensor_df['Longitude'])
    min_idx = distances.idxmin()
    closest_sensors.append({
        'Artwork Name': art['Name'],
        'Closest Sensor': sensor_df.loc[min_idx, 'Sensor_Name'],
        'Distance_km': distances[min_idx]
    })
In [74]:
# Convert result to DataFrame
result_df = pd.DataFrame(closest_sensors)

# Drop exact duplicate rows
result_df = result_df.drop_duplicates().reset_index(drop=True)

result_df = result_df.merge(sensor_df[['Sensor_Name', 'Total_of_Directions']], 
                            left_on='Closest Sensor', 
                            right_on='Sensor_Name', 
                            how='left')

# Optionally drop the redundant Sensor_Name column
result_df.drop(columns=['Sensor_Name'], inplace=True)

# Sort by total pedestrian count, highest first
result_df = result_df.sort_values(by='Total_of_Directions', ascending=False).reset_index(drop=True)
result_df
Out[74]:
| | Artwork Name | Closest Sensor | Distance_km | Total_of_Directions |
| 0 | Fault Line | SouthB_T | 0.130639 | 43179484 |
| 1 | Painted Poles | Swa31 | 0.047437 | 42426049 |
| 2 | Captain Matthew Flinders Statue | Swa31 | 0.060307 | 42426049 |
| 3 | Beyond the Ocean of Existence | Swa31 | 0.032577 | 42426049 |
| 4 | Resting Place | QVN_T | 0.039533 | 31742404 |
| ... | ... | ... | ... | ... |
| 256 | Scar - A Stolen Vision | EntPark1671_T | 0.023444 | 78040 |
| 257 | Hotham Hill Paving Inlay | Mac330_T | 0.446446 | 6962 |
| 258 | Clayton Reserve Drinking Fountain | Mac330_T | 0.328262 | 6962 |
| 259 | Bird Panels | Mac330_T | 0.474982 | 6962 |
| 260 | Hotham Hill Seat | Mac330_T | 0.442644 | 6962 |

261 rows × 4 columns

In [75]:
result_df.to_csv('pedestrian_distance.csv', index=False)

🔶 1. Public Artworks Are Strategically Located Near Transport Hubs
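The artworks in the transport dataset lie within walking distance of public transport. In the ten rows shown above, the nearest stop of any mode is about 47–233 meters away, and the distances quoted in the examples below range from ~200 to ~1,100 meters.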

For example:

  • Speakers Corner is just ~217 meters from a tram stop and ~795 meters from Flinders Street Station.

  • Grotto Waterfall is within ~533 meters of a tram stop and ~1106 meters of Richmond Station.

Conclusion: Public artworks are placed in well-connected urban areas, ensuring high accessibility via public transport. This supports cultural exposure and encourages engagement from commuters and tourists alike.
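
The accessibility claim can be checked directly from the distance columns created earlier. The sketch below takes, for each artwork, the smallest of the three per-mode distances and summarizes it. The helper name nearest_any_mode and the nearest_any_m column are illustrative assumptions; the three sample rows are copied from the artworks_df.head(10) output, and in the notebook the same function could be applied to the full artworks_df.

```python
import pandas as pd

def nearest_any_mode(df):
    """Distance in meters to the closest stop across train, tram and bus."""
    cols = ["train_distance_m", "tram_distance_m", "bus_distance_m"]
    return df[cols].min(axis=1)

# Three rows copied from the artworks_df.head(10) output
sample = pd.DataFrame({
    "Name": ["Port Phillip Monument", "Coat of Arms", "Federation Bells"],
    "train_distance_m": [613.56, 344.29, 633.55],
    "tram_distance_m": [738.72, 47.25, 194.29],
    "bus_distance_m": [233.27, 414.91, 726.99],
})
sample["nearest_any_m"] = nearest_any_mode(sample)

# Summary statistics (min, max, quartiles) of the nearest-stop distance
summary = sample["nearest_any_m"].describe()
print(sample[["Name", "nearest_any_m"]])
```

Running describe() on the full dataset would give the actual minimum and maximum nearest-stop distance, which is a firmer basis for the stated range than a handful of examples.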

🔷 2. High Pedestrian Activity Around Artworks
Artworks such as Fault Line, Painted Poles, and Captain Matthew Flinders Statue are within 50–130 meters of pedestrian sensors.

These sensors register very high foot traffic, e.g.:

  • SouthB_T (near Fault Line): about 43.2 million pedestrian counts (43,179,484).

  • Swa31 (near multiple artworks): about 42.4 million pedestrian counts (42,426,049).

Conclusion: Public artworks are co-located with Melbourne’s busiest pedestrian areas, amplifying visibility and interaction. This aligns with city planning strategies to embed culture into everyday life.
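
One caution when reading the ranking: the closest sensor is not always close. Several artworks matched to Mac330_T are 0.3–0.5 km from it, so its count may say little about footfall at the artwork itself. A minimal sketch of a distance filter follows; the 0.15 km threshold and the helper name sensors_within are assumptions chosen for illustration, and the sample rows are copied from the result_df output.

```python
import pandas as pd

def sensors_within(df, max_km=0.15):
    """Keep artworks whose closest sensor lies within max_km."""
    return df[df["Distance_km"] <= max_km].reset_index(drop=True)

# Rows copied from the result_df output
sample = pd.DataFrame({
    "Artwork Name": ["Fault Line", "Painted Poles", "Bird Panels"],
    "Closest Sensor": ["SouthB_T", "Swa31", "Mac330_T"],
    "Distance_km": [0.130639, 0.047437, 0.474982],
    "Total_of_Directions": [43179484, 42426049, 6962],
})
nearby = sensors_within(sample)
print(nearby[["Artwork Name", "Distance_km", "Total_of_Directions"]])
```

Applying the same filter to the full result_df before ranking would restrict the footfall comparison to artworks that a sensor plausibly represents.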
